c-lasso - a Python package for constrained sparse and robust regression and classification
Authors
Abstract
We introduce c-lasso, a Python package that enables sparse and robust linear regression and classification with linear equality constraints. The underlying statistical forward model is assumed to be of the following form: \[ y = X \beta + \sigma \epsilon \qquad \textrm{subject to} \qquad C\beta=0 \] Here, $X \in \mathbb{R}^{n\times d}$ is a given design matrix and the vector $y \in \mathbb{R}^{n}$ is a continuous or binary response vector. The matrix $C$ is a general constraint matrix. The vector $\beta \in \mathbb{R}^{d}$ contains the unknown coefficients and $\sigma$ an unknown scale. Prominent use cases are (sparse) log-contrast regression with compositional data $X$, requiring the constraint $1_d^T \beta = 0$ (Aitchison and Bacon-Shone 1984), and the Generalized Lasso, which is a special case of the described problem (see, e.g., (James, Paulson, and Rusmevichientong 2020), Example 3). The c-lasso package provides estimators for inferring unknown coefficients and scale (i.e., perspective M-estimators (Combettes and Müller 2020a)) of the form \[ \min_{\beta \in \mathbb{R}^d, \sigma \in \mathbb{R}_{0}} f\left(X\beta - y,{\sigma} \right) + \lambda \left\lVert \beta\right\rVert_1 \qquad \textrm{subject to} \qquad C\beta = 0 \] for several convex loss functions $f(\cdot,\cdot)$. This includes the constrained Lasso, the constrained scaled Lasso, and sparse Huber M-estimators with linear equality constraints.
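To make the optimization problem in the abstract concrete, here is a minimal NumPy sketch of its simplest instance: the constrained Lasso with a least-squares loss and the scale $\sigma$ held fixed, applied to synthetic log-contrast data with the zero-sum constraint $1_d^T \beta = 0$. It does not use the c-lasso package or its solvers. The ADMM scheme, the function names (`soft_threshold`, `constrained_lasso_admm`), the parameters `rho` and `n_iter`, and the synthetic data are all assumptions made for illustration.

```python
import numpy as np


def soft_threshold(v, t):
    """Elementwise soft-thresholding, the proximal map of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def constrained_lasso_admm(X, y, C, lam, rho=1.0, n_iter=500):
    """Illustrative ADMM for
        min_beta 0.5 * ||X beta - y||^2 + lam * ||beta||_1  s.t.  C beta = 0.
    Hypothetical helper, not the c-lasso implementation."""
    n, d = X.shape
    k = C.shape[0]
    # KKT matrix of the equality-constrained quadratic beta-update.
    K = np.block([[X.T @ X + rho * np.eye(d), C.T],
                  [C, np.zeros((k, k))]])
    Xty = X.T @ y
    z = np.zeros(d)  # sparse copy of beta
    u = np.zeros(d)  # scaled dual variable for beta = z
    for _ in range(n_iter):
        rhs = np.concatenate([Xty + rho * (z - u), np.zeros(k)])
        beta = np.linalg.solve(K, rhs)[:d]        # satisfies C beta = 0
        z = soft_threshold(beta + u, lam / rho)   # enforces sparsity
        u = u + beta - z
    return beta, z


# Synthetic compositional data (assumed for illustration only).
rng = np.random.default_rng(0)
n, d = 50, 10
raw = rng.uniform(0.1, 1.0, size=(n, d))
comp = raw / raw.sum(axis=1, keepdims=True)   # rows sum to one
X = np.log(comp)                              # log-contrast design matrix
beta_true = np.zeros(d)
beta_true[0], beta_true[1] = 1.0, -1.0        # sums to zero
y = X @ beta_true + 0.01 * rng.standard_normal(n)
C = np.ones((1, d))                           # zero-sum constraint
lam = 0.1

beta_hat, z_hat = constrained_lasso_admm(X, y, C, lam)


def objective(b):
    return 0.5 * np.sum((X @ b - y) ** 2) + lam * np.sum(np.abs(b))


print("sum of coefficients:", beta_hat.sum())
print("objective at estimate:", objective(beta_hat))
```

Because the estimated coefficients sum to zero, the fitted values are the same whether the design matrix is built from the normalized compositions or from the unnormalized counts: `np.log(raw) @ beta_hat` agrees with `X @ beta_hat` up to rounding. This scale invariance is the reason the zero-sum constraint is imposed for compositional data.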
Similar resources
Robust Estimation in Linear Regression with Multicollinearity and Sparse Models
One of the factors affecting the statistical analysis of the data is the presence of outliers. The methods which are not affected by the outliers are called robust methods. Robust regression methods are robust estimation methods of regression model parameters in the presence of outliers. Besides outliers, the linear dependency of regressor variables, which is called multicollinearity...
The innovation of a statistical model to estimate dependable rainfall (DR) and develop it for determination and classification of drought and wet years of Iran
Water from precipitation is the source that meets the countless needs of living beings, especially humans, and any decline in its quantity or quality directly and negatively affects the life of living organisms. Year-to-year fluctuation is one of the fundamental and most important characteristics of Iran's annual precipitation, and its harmful effects are reflected in one way or another in all economic, social, and even political and security spheres. Because the amount of water from precipitation is one of the main components of planning ...
pyGPs: a Python library for Gaussian process regression and classification
We introduce pyGPs, an object-oriented implementation of Gaussian processes (GPs) for machine learning. The library provides a wide range of functionalities reaching from simple GP specification via mean and covariance and GP inference to more complex implementations of hyperparameter optimization, sparse approximations, and graph based learning. Using Python we focus on usability for both “use...
Quantile regression with group lasso for classification
Applications of regression models for binary response are very common and models specific to these problems are widely used. Quantile regression for binary response data has recently attracted attention and regularized quantile regression methods have been proposed for high dimensional problems. When the predictors have a natural group structure, such as in the case of categorical predictors co...
Robust and sparse bridge regression
It is known that when there are heavy-tailed errors or outliers in the response, the least squares methods may fail to produce a reliable estimator. In this paper, we proposed a generalized Huber criterion which is highly flexible and robust for large errors. We applied the new criterion to the bridge regression family, called robust and sparse bridge regression (RSBR). However, to get the RSBR...
Journal
Journal title: Journal of Open Source Software
Year: 2021
ISSN: ['2475-9066']
DOI: https://doi.org/10.21105/joss.02844